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Abstract 

This paper presents a highly efficient decomposition scheme and its associ- 
ated Mathematica notebook for the analysis of complicated quantum circuits com- 
prised of single/multiple qubit and qudit quantum gates. In particular, this scheme 
reduces the evaluation of multiple unitary gate operations with many condition- 
als to just two matrix additions, regardless of the number of conditionals or gate 
dimensions. This improves significantly the capability of a quantum circuit anal- 
yser implemented in a classical computer. This is also the first efficient quantum 
circuit analyser to include qudit quantum logic gates. 



1. Program Summary 

Title of program. CUGates.m 
Programming language used: Mathematica 

Computers and operating systems: any computer installed with Mathematica 6.0 
or higher 

Distribution format: Mathematica notebook 

Nature of problem: The CUGates notebook simulates arbitrarily complex 

quantum circuits comprised of single/multiple qubit and qudit quantum gates. 
Method of solution: It utilizes an irreducible form of matrix decomposition for a 

general controlled gate with multiple conditionals and is highly efficient in 

simulating complex quantum circuits. 
Running time: Details of CPU time usage for various example runs are given in 

Section 4. 

Program obtainable from: CPC Program Library, Queens University of Belfast, 
N. Ireland 
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2. Introduction 

At the heart of a quantum computer lies a set of qubits and/or qudits whose 
states are manipulated by a series of quantum logic gates, namely a quantum cir- 
cuit, to provide the ultimate computational results. It is therefore of particular 
interest to be able to efficiently evaluate the performance of a quantum circuit 
(such as its reliability, effectiveness, robustness, sensitivity to decoherence and 
errors) in the design stage using a classical computer. 

There are currently several quantum computer simulators reported in the liter- 
ature djU^HsLH,!!!], which simulate quantum circuits consisting of 1, 2 or 3 qubit 
gates such as the Hadamard, CNOT and Toffoli gate. The CNOT and Toffoli gate 
are examples of controlled unitary gates (CUGs), which implement operations 
that are conditional on the state of the specified control qubits. Other more gen- 
eral CUGs (acting across qubits or qudits) can always be decomposed in terms 
of a universal set of 1- and 2-qubit quantum gates fl6Q, but this would require sig- 
nificant computational overhead in the analysis. To the best of our knowledge, 
there are no efficient quantum simulators on quantum circuits with multiple qudit 
controlled quantum gates. 

In this paper, we present a highly efficient scheme for the evaluation of arbi- 
trary CUGs. This scheme reduces the evaluation of multiple unitary gate opera- 
tions with many conditionals to just two matrix additions, regardless of the num- 
ber of conditionals or gate dimensions. The implementation of this scheme, and 
many other functions used to analyse quantum circuits, is provided in a Mathemat- 
ica 7.0 package entitled CU Gates. m. The computation time required to evaluate 
the CNOT and Toffoli gates using this package is compared with the QDENSITY 
package [1] and is found to be several orders of magnitude more efficient. Exam- 
ples of quantum circuits involving controlled unitary gates and their analysis using 
the notebook are presented. A compilation of the Mathematica code presented in 
the paper is provided in the Mathematica notebook CUGates.nb. 

3. Decomposition of CUGs 

3.1. CUGs across qubits 
3.1.1. Definitions and notation 

Denote a set of qubits as Q, and the wavefunction (if definable) for the ith 
qubit as | y/j). Q is in a basis state iff | yi) = |0) V | y/j) = 1 1) V * G Q. Define C c < as 
being conditional on the state 1 1) of qubit c, , and C Cj as being conditional on the 
state |0) of qubit cj. 
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A CUG with conditionals C Clr " ,c 'C Cl >-> c j implementing unitary operations 
Uj* k , where «i , denotes the starting qubit of the corresponding U block, 
is represented by C Cl, "" ,c 'C Clr "' ,c -'t/" 1 ■■■U? k . Effectively, the action of this CUG is 
such that it implements the operations U" l ...U? k iff the set of control qubits is in 

the basis state described by |l/A C] c .) = |1) and h^ci,...,c/) = |0). For any other 

basis state, the CUG leaves the system of qubits unchanged. Figure Q] shows an 
example of the C l C 3 ' 6 UfU% gate. 

IVi) T 



¥2) — Ui 



U 2 



V3> 

¥a) 

¥e) 

Figure 1: The C'C 3 ' 6 ^ 2 ?/^ gate, with C conditional on qubit 1 being 1 1), C conditional on qubit 3 
and 6 being |0), and the operations U\ and U2 are implemented on qubits 2 and 4 to 5 respectively. 



3.1.2. Decomposition 

An efficient way to evaluate arbitrary controlled unitary gates is to decompose 
the operation by defining the projection operators Pq and Pi as: 
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Note that Pq and Pi are non-unitary matrices and Pq + Pi = h is the 2-by-2 identity 
matrix. Now consider the C l U 2 (abbreviated as CU) gate, shown in Figure [2l 
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Figure 2: The CU gate. 

It can be readily verified and proven that the matrix for the CU gate is given 
as CU — Pq <g) I2 + P\ <8> U (see appendix A for details). This result, called the 
decomposition of the CU gate as a sum, is graphically summarised in Figure [3j 
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Figure 3: Decomposition of the CU gate. 



The key idea is that we can use the projection operators Pq and P\ to project 
the set of control qubits to a basis state. For any basis state, the action of the 
CUG gate is either just the t/" 1 ■■■U? k operators, or no action at all (i.e. the identity 
operator). By considering all possible basis states of the set of control qubits, we 
can construct the matrix of the CUG gate by summing together the action of the 
CUG gate corresponding to each possible basis state. 

For any arbitrary C Cl, '" ,c 'U" 1 ...U^ k gate, consider replacing each conditional 
with a Pq or P\ operator. This can be done in 2' distinct ways. For the basis 

state described by | \ff C] c .) = 1 1 ) , which corresponds to the permutation C m — > Pi 

V m = ci,...,c/, the action of the CUG is the operations U^.-.U^. Any other 
basis state (and hence permutation) corresponds to the action of the CUG being 

the identity operator. The sum of the 2 l permutations yields the matrix of the 

C c u 



i U l ■■■U k gate. For example, 



C 1,3 t/ 2 



PQ®h®Po+Po®h ®P\ +Pi ®h®Po+Pi®U®P h 
as graphically shown in Figure @] (see appendix B for a mathematical proof). 
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Figure 4: Decomposition of the C 1,3 t/ 2 gate. 



Similarly, for any arbitrary C cl, — ,c jU" l ...U^ k gate, consider the 2 J possible 
permutations that arise from replacing each C conditional with a Pq or P\ operator. 
For the basis state described by \ Yci,—,ci) = 1^)' w hich corresponds to the permu- 
tation C n — > Pq V n = c\ ,—,cj, the action of the CUG is the operations U"K..U^ k . 
Any other basis state corresponds to the action of the CUG being the identity op- 
erator. The sum of the 2 7 permutations gives the matrix of the C Clr " ,c i~U™ 1 ..XJ? k 
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gate, for example 



as 



graphically shown in Figure [51 
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Hence, for any arbitrary C c ' Ir ••> Ci C Cl, "' ,c iU" 1 ...U? k gate, we consider each of the 
2 l +J permutations that arise from replacing each C and C conditional with a Pq or 
Pi operator. For the basis state described by \\jf C i,...,ci) = I 1 ) an d |Vci,...,c;) = |0), 
which corresponds to the permutation C m — >■ Pi V m = ci, ...,Cj and C" — >■ Po V 
n = ci, ...,c 7 , the action of the CUG is the operations U" [ ...U^ k . Any other basis 
state corresponds to the action of the CUG being the identity operator. The sum of 
the 2 ,+J permutations yields the matrix of the C Cu —> Ci C Cl >'"> c JU" 1 ...U^ k gate. For 
example, 

C 3 C 1 U 2 = P ® 7 2 <g> P + P) ® U ® Pi + Pi <g> 7 2 ® Po + A ® /2 ® ^i , 
as graphically shown in Figure [6] 
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Figure 6 


Decomposition of the C^C 1 !/ 2 £ 


;ate. 





3.1.3. Reduction to its irreducible form 

For an arbitrary C Cu •■•> c 'C Cl > ■" ,c iU" 1 ...U^ k gate, a naive implementation of the 
previous section would require 2 ,+J — 1 matrix additions to compute the matrix 
of the gate. However, this overhead can be reduced significantly by noting that 
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only one permutation has the U" l ...U^ k operators being implemented, while the 
other 2 ,+J — 1 possible permutations have identity operators substituted in for the 
U" l ...U^ k operators. As an example, consider the gate (essentially the identity 
matrix /§) shown in Figure [71 which has 2 2 — 1 of the same permutations as in 
Figure [61 Consequently, we can write the decomposition of the C 3 C l U 2 gate as 
the following 

C 3 C 1 U 2 = Ig+P ®U® Pi-Po^h ®Pi, (1) 
which is graphically represented by Figure [8l 
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Figure 7: Decomposition of the 1% gate. 
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Figure 8: Optimized decomposition of the C^C 1 U 2 gate. 



For the general case, the matrix of any arbitrary C Cu " ,,Ci C Cl >'"> c JU^ 1 ...U^ k gate 
is simply the identity matrix (of appropriate dimensions), added together with the 
permutation that has the operations U^...U^ k implemented, subtracted with the 
same permutation with the U" 1 ...U? k operators replaced with identity operators. 
In effect, the identity matrix is used to encapsulate 2 i+; — 1 permutations. Hence, 
for any arbitrary C clr *' ,Ci C clr '*' c -'[/" 1 ...?7^ gate, computation of the gate matrix 
requires only two matrix additions, regardless of the number of controls or the 
gate dimensions. Note that the only instance in which this decomposition scheme 
is less efficient than the naive implementation is when only one C or C condi- 
tional is involved. The optimized decomposition of a more complex example, the 
C C ' U 3 1/| gate, is given in Figured 
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Figure 9: Optimized decomposition of the C 2 C XA UlU2 gate 



3.2. CUGs across audits 
3.2.1. Definitions and notation 

Denote the wavefuntion of the /-level qudit j as 



) . Define a quantum circuit 



consisting of n qudits 



Yi 



Yi 



where Q represents the number 



of levels in the ith qudit and £ = {£1, £2, Cn}- We call £ the quantum circuit 
profile, which is the list of qudit levels, arranged according to the order of the 
qudits. For example, any CUG applied across qubits has £ = {2,2, ...,2}, since 
qubits are 2-level qudits. Also define as being conditional on the state \sj — 1} 
of qudit Cj, where 1 < Sj < £ C( . 

A CUG with conditionals C^ 1 . ..C^ implementing unitary operations t/" 1 . -Ur k , 
where u\, ui denotes the starting qudit of the corresponding U block, is repre- 
sented by Csl...C c s >U" 1 ...U£ k . Figure [TO] shows an example of the C\C\C\v\\j\ 
gate. 

3.2.2. Decomposition 

We can readily extend the concept of projection operators to qudits, by defin- 
ing (P a ,b) i j = SaiSaj V 1 < i, j < b (where 8ij is the Kronecker delta) as the projec- 
tion to the state \a — 1) acting on a Z?-leveled qudit, with the restriction 1 < a <b. 

Hence every ^-leveled qudit has a set of b projection operators defined, with the 

b 

property £ P a>b = I b . 



a— l 



For a general C^.-.C^t/" 1 —U£ k gate, it is clear that by substituting each con- 
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Figure 10: The C\C\ C^U\ gate, acting on 5 qudits of various levels. The quantum circuit profile 
is C = {3,4,5,5,2}. 



ditional with a (valid) projection operator, it would result in C,p = Y\ Ccj distinct 

permutations. However, since the unitary operations U" l ...U^ k are only carried 
out iff the control qudits ci,...,q are in the states \s\ — 1 ),..., \si — 1) respectively, 
only the permutation described by C\\ — >■ P s . r V j = 1, exactly will have 

U" 1 ■■■Uj* k implemented; any other permutation will have identity operators sub- 
stituted in place of U" 1 ...U^ k . The sum of all C,p permutations yields the matrix of 
the C c s \...C c s \U" l ...Ul k gate. For example, 

C\C\ U 2 =P i j®I 5 ®Pi2+P i 3®I 5 ®P 2 ,2 J rP23®h®Pl2 J r (2) 
^2,3 ® h ® Pl2 + P3,3 ®U®P 1>2 + P 3 ,3 ®h® Pl2 

as graphically shown in Figure [TT] 
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Figure 11: Decomposition of the C^ClU 2 gate. 
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3.2.3. Reduction to its irreducible form 

As before, we can use the identity matrix (of appropriate dimensions) to en- 
capsulate C,p — 1 permutations of a C c s \ ...C C S \U^ ...U^ k gate, since only one of the 
permutations have the U" 1 ■■■U? k operations implemented. The matrix of the CUG 
is thus the identity matrix (of appropriate dimensions) added together with the 
permutation described by CA — )■ P s . r V j = 1, i, minus the same permutation 

with identity matrices substituted in place of U" 1 ...U^ k . The optimized decompo- 
sition of the C\C\U 2 gate is given in Figure [T2l The optimized decomposition of 
a more complex example, the C^ClC^UlU* gate, is given in Figure [131 
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Figure 12: Optimized decomposition of the C\C\ U 2 gate. 
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Figure 13: Decomposition of the C\C\C\lj\lJ^ gate. 



4. Comparison with the QDENSITY package 

The QDENSITY package [1] provides many functions for the simulation of 
quantum circuits, two of which simulate the CNOT gate and the Toffoli gate. A 
more recent paper [0] introduces QCWAVE as an extension of the QDENSITY 
package. QCWAVE has the functions Op2 and Op3 that can be used to reproduce 
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the action of CNOT n and Toffoli„ gates on state vectors, but does not give the 
matrix of the gates itself. We find it more straightforward and efficient to use 
the QDENSITY functions to construct the matrix and then act on the state vector, 
and hence we perform the following comparison using the QDENSITY package 
of version 4.0 (updated since yl). 

Here, we compare the CPU time taken to compute the matrices for the same 
gates, using the CNOT and Toffoli functions provided in the QDENSITY pack- 
age and the more general CUGate function provided in the CUGates.m package. 
The QDENSITY functions implements a decomposition using many more matrix 
additions and list manipulations in comparison with the scheme described in this 
paper. 

We define the CNOT n gate as spanning n qubits with the C control located 
at the first qubit, and the NOT gate located at the n th qubit. The Toffoli,, gate is 
defined as spanning n qubits with the C controls at qubits 1 and 2, and the NOT 
gate located at the n th qubit. Using these definitions, we are able to measure the 
CPU time taken to compute the matrix against n, which is plotted in Figure [141 

1000 e 1 
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Number of qubits, n 



Figure 14: CPU time taken. Circle/Square: Time taken using the CNOT/Toffoli function in the 
QDENSITY 4.0 package. Diamond/Triangle: Time taken using the CUGate function in the CU- 
Gates.m package with the sparse-matrix optimization to compute the CNOT/Toffoli gate. 



10 



As demonstrated in Figure [H the CUGate function is significantly faster (by 
several orders of magnitude) than the CNOT and Toffoli functions provided in the 
QDENSITY package. In the actual Mathematica implementation of the CUGate 
function, we utilized sparse-matrix optimization to maximize calculation speed, 
which in this case, provides a speedup of about 1.8 compared to the CUGate func- 
tion without the sparse-matrix optimization. It is also worth noting that while the 
Toffoli function takes almost twice as long as the CNOT function to compute its 
result, the CUGate function takes approximately the same length of time to com- 
pute the matrix of a CNOT and Toffoli gate for any particular n, which is expected 
from the decomposition scheme described in this paper. In general, computation 
of the matrix of any two controlled unitary gates spanning the same number of 
qubits using the CUGate function takes the same length of time. 

To perform this analysis, we have timed the use of the functions in Mathemat- 
ica using the Timing function, averaged over 10 trials. Computations were done 
on a laptop with an Intel Core i7-740QM processor with a speed of 1.73GHz. 
Results for n < 10 using the CUGate function is omitted since the minimum gran- 
ularity of the Timing function is more than the CPU time needed for the CUGate 
function. 

5. Worked examples 

First load the CUGates.m package in Mathematica using the following syntax: 

in[i] := Needs["CUGates"'] 

Brief descriptions of each function included in the CUGates.m package can be 
accessed using the '?' operator. For example, 

In[2] := ?CUGate 
Out[2] := CUGate[cpos,cbarpos,ubegin,umatrix] 

Returns the matrix of a CUG across qubits with C conditionals at epos, 
C conditionals at cbarpos, and unitary operators umatrix with the 
corresponding starting positions ubegin. 
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In[3] := ?CUGateG 
Out[3] := CUGateG[qcp,clist,ubegin,umatrix] 

Returns the matrix of a CUG across qudits with conditionals described by clist, 
and unitary operators umatrix with the corresponding starting positions ubegin. 
Note: clist is a list of {Index of qudit in qcp,Conditional state} 



The qubit- specific subroutines are: BasisState Vector, CUGate, EqualSuper- 
position, HadamardGate, ListStates, MeasureQubits, MeasureSingleQubit, NOT- 
Gate, PHASEGate, SWAPGate and SWAPQubits. 

The general qudit subroutines are: BasisState VectorG, CUGateG, EqualSu- 
perpositionG, ListStatesG, PHASEGateG, POp, QFTMinus, QFTPlus, RMinus, 
PvPlus and SWAPQudits. The definitions for the functions QFTMinus and QFT- 
Plus are similar to that of the QFT operator defined in 

5. 1 . Shor 's algorithm 

Figure [15] shows the implementation of Shor's algorithm to factorize N = \5 
for co-prime, C = 7 |@]. 



|0) 



H 



|0> -H 



|o> -{W 

10) - 

|o) - 

|o> — 

ll> — 



-0- 



^ 9 



-0 



k/2 



H 



7t/4 - k/2 H 



Figure 15: Quantum circuit for Shor's algorithm, N =15 and C = 7. 
Using Mathematica, we first initialize the qubit states as follows: 

In[4] := InputVector = BasisState Vector[{0,0,0,0,0,0,l}]; 

HTransform = KroneckerProduct[HadamardGate[],HadamardGate[], 
HadamardGate[],IdentityMatrix[2 4 ]]; 
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Modular exponentiation is carried out on qubits 4 to 7 below: 

In[5] := ModA = KroneckerProduct[IdentityMatrix[2 2 ], 

CUGate[{3},{},{5},{NOTGate[]}],IdentityMatrix[2 2 ]]; 
ModB = KroneckerProduct[IdentityMatrix[2 2 ], 

CUGatetiai^l^el^NOTGateGllJdentityMatrixP 1 ]]; 
ModC = KroneckerProduct[IdentityMatrix[2 3 ], 

CUGate[{4},{},{6},{NOTGate[]}],IdentityMatrix[2 1 ]]; 
ModD = KroneckerProduct[IdentityMatrix[2 1 ], 

CUGate[{2,6},{},{4},{NOTGate[]}],IdentityMatrix[2 1 ]]; 
ModE = ModC; 

ModF = KroneckerProduct[IdentityMatrix[2 4 ], 

CUGate[{7},{},{5},{NOTGate[]}]]; 
ModG = KroneckerProduct[IdentityMatrix[2 1 ], 

CUGate[{2,5},{},{7},{NOTGate[]}]]; 
ModH = ModF; 

Next, the inverse QFT (Quantum Fourier Transform) is performed on the first 
three qubits: 

In[6] := QftA = KroneckerProduct[HadamardGate[],IdentityMatrix[2 2 ]]; 

QftB = KroneckerProduct[CUGate[{l},{},{2},{PHASEGate[a:/2]}], 

IdentityMatrix[2 5 ]]; 
QftC = KroneckerProduct[IdentityMatrix[2 1 ],HadamardGate[], 

IdentityMatrix[2 5 ]]; 
QftD = KroneckerProduct[CUGate[{l},{},{3},{PHASEGate[?r/4]}], 

IdentityMatrix[2 4 ]]; 
QftE = KroneckerProduct[IdentityMatrix[2 1 ], 

CUGate[{2},{},{3},{PHASEGate[a:/2]}],IdentityMatrix[2 4 ]]; 
QftF = KroneckerProduct[IdentityMatrix[2 2 ],HadamardGate[], 

IdentityMatrix[2 4 ]]; 
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We then multiply the matrices together from right to left, apply it to an initial 
qubit states, and obtain the final state of the quantum register. 



In[7] := TMatrix = QftF.QftE.QftD.QftC.QftB.QftA.ModH.ModG. 

ModF.ModE.ModD.ModC.ModB.ModA.HTransform; 
OutputVector = TMatrix.InputVector; 
ListStatesfOutput Vector] ; 

Out[7] := List of qubit states with a non-zero amplitude: 

(i) 10000001) + (i) 10000100) + (I) 10000111) + (i) |0001101) + 
(1) 10010001) + (i) |0010100) + (-i) 10010111) + (-i) |0011101) + 
(i) 10100001) + (-|) 10100100) + (I) 10100111) + (-|) |0101101) + 
(I) 10110001) + (-i) 10110100) + (-|) 10110111) + (I) 10111101) 



The most important part of the result is the state measurement of qubits 1, 2 
and 3, which constitute the output register. Upon measurement, qubit 1 is solely 
in the computational basis |0), whereas qubits 2 and 3 are in a mixture of both 
computational bases, |0) and |1). Written in reverse order, we have a superpo- 
sition of the combined states |000), |010), 1 100), and 1 1 10) for the three qubits 
in the output register, which has a periodicity of p = 2. According to Shor's 
algorithm, the factors are then given by the greatest common divisor (gcd) of 

2»— 1 

C p ± 1 and N, where n = 3 is the number of qubits in the output register. There- 
fore gcd(C^p~ ± 1,/Y) = gcd(l^r- ± l, 15) = gcd(l 2 ± 1, 15) = 3,5, which are 
indeed the factors ofN= 15. 



5.2. Quantum random walks 

Here, we are concerned with the quantum circuit implementation of quantum 
walks on highly symmetrical graphs. There exists different software packages that 
can implement quantum random walks across graphs, e.g. the QWalk package 
implements a quantum walk across 1-dimensional and 2-dimensional lattices II 1 011 
and the qwViz package visualize a quantum walks on arbitrarily complex graphs 
[| 1 1|1 - as well as various quantum state based physical implementation schemes 
such as described in (vl, 13], without reference to a circuit implementation of the 
graph. However, we consider circuit implementations here to illustrate the use of 
the CUGates package. 
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5.2.1. 16-length cycle 

Consider the quantum circuit shown in Figure [16l which implements a quan- 
tum walk on a 16-length cycle using the Increment/Decrement gates [11411 shown 
in Figure [FT] First, we define the functions IncrementGate and DecrementGate in 
Mathematica as below to calculate the matrix of the Increment/Decrement gate, 
given the number of qubits involved. 
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Figure 16: Quantum circuit implementing a quantum walk along a 16-length cycle. 
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Figure 17: Increment and decrement gates on n qubits, producing cyclic permutations in the 2" 
bit-string states. 
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In[8] := IncrementGate[NQubitJnteger] := 

( 

Module 

[ { ReturnMatrix,i j }, 
ReturnMatrix = IdentityMatrix[2 w 2" W; ]; 
For[i = 1, i < NQubit, ++i, 
ReturnMatrix = KroneckerProduct[IdentityMatrix[2 !_1 ], 
CUGate[Table[j,{j,i+l,NQubit}],{},{i}, 
{NOTGate[]}].ReturnMatrix; 

]; 

Return[KroneckerProduct[IdentityMatrix[2 iVe " 4 "' 
NOTGate[]].ReturnMatrix]; 

] 

) 

In[9] := DecrementGate[NQubitJnteger] := 

( 

Module 

[ { ReturnMatrix,i j } , 
ReturnMatrix = IdentityMatrix[2 A ' e " / " f ]; 
For[i = 1, i < NQubit, ++i, 
ReturnMatrix = KroneckerProduct[IdentityMatrix[2 ! '], 
CUGate[{},Table[j,{j,i+l,NQubit}],{i}, 
{NOTGate[]}].ReturnMatrix; 

]; 

Return[KroneckerProduct[IdentityMatrix[2 A ' e " 4 ' ! ' 
NOTGate[]].ReturnMatrix]; 

] 

) 
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Using these definitions, we calculate the matrix of the circuit and apply it to the 
state vector signifying the initial vertex to be the 9th vertex (node representation 
of 1 10000)) with the subnode initially set to |0). 

In[10] := InputVector=BasisStateVector[{l,0,0,0,0}]; 

Coin = KroneckerProduct[IdentityMatrix[2 4 ],HadamardGate[]]; 
Tl = CUGate[{5},{},{l},{IncrementGate[4]}]; 
T2 = CUGate[{},{5},{l},{DecrementGate[4]}]; 
TMatrix = T2.Tl.Coin; 
OutputVector = TMatrix.InputVector; 
ListStates[Output Vector] ; 
Out[10] := List of qubit states with a non-zero amplitude: 
(£) |01100)+ (£) |10011) 

From the output, we can see that the initial state 1 10000) has been shifted to a 
superposition of states |01 100) and 1 10011), which are the nodes adjacent to the 
initial state in a 16-length cycle. Further iterations will cause the quantum walk 
to propagate further along the cycle, with each state simultaneously moving to its 
adjacent states. 

5.2.2. Complete 2? -graph with self-loops 

As an example involving qudits in a quantum circuit, we analyze the quan- 
tum walk along the complete 3"-graph with self loops as discussed in II 1411 . The 
complete 3 3 -graph with self-loops can be constructed as in Figure [T8l 

Here, the operator T± is defined as (T±) a b = ^ e± ^~ V 1 < a,£> < 3, and the 
quantum circuit profile is now £ = {3, 3, 3, 3, 3, 3, 2}. This can be implemented in 
Mathematica as follows: 

In[ll]:= TMinus = QFTMinus[3]; 
TPlus = QFTPlus[3]; 
QCProflle = {3,3,3,3,3,3,2}; 

The coin operator is calculated as follows: 
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In[12] := CI = SparseArray[KroneckerProduct[IdentityMatrix[3 3 ], 

TPlus,TPlus,TPlus,IdentityMatrix[2]]]; 
C2 = SparseArray[KroneckerProduct[IdentityMatrix[3 3 ], 

CUGateG[QCProflle,{{4,l},{5,l},{6,l}},{7},{NOTGate[]}], 

IdentityMatrix[2]]]; 
C3 = SparseArray[KroneckerProduct[IdentityMatrix[3 5 ], 

CUGateG[QCProflle,{{7,l}},{6},{PHASEGateG[{a:,0,0}]}]]]; 
C4 = C2; 

C5 = SparseArray[KroneckerProduct[IdentityMatrix[3 3 ], 
TMinus,TMinus,TMinus,IdentityMatrix[2]]]; 



The shifting operator can be implemented as such: 

In[13] := Tl = SparseArray[KroneckerProduct[SWAPQudits[QCProflle,l,4], 

IdentityMatrix[3 2 *2]]]; 
T2 = SparseArray[KroneckerProduct[IdentityMatrix[3], 

SWAPQudits[QCProflle,2,5], IdentityMatrix[3 * 2]]]; 
T3 = SparseArray[KroneckerProduct[IdentityMatrix[3 2 ], 

SWAPQudits[QCProflle,3,6],IdentityMatrix[2]]]; 
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Finally, we can calculate the matrix of the circuit, and apply it to the state 
vector signifying the initial vertex to be the 1st vertex (node representation of 
|000)), and obtain the result of a single iteration of the circuit. 

In[14] := InputVector = BasisStateVectorG[QCProflle,{0,0,0,0,0,0,0}] 
TMatrix = Normal[T3.T2.Tl.C5.C4.C3.C2.Cl]; 
OutputVector = TMatrix.InputVector; 
ListStatesG[QCProfile,OutputVector]; 

Out[14] := List of qudit states with a non-zero amplitude: 
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5.2.3. 3 generation 3-Cayley tree 

As a further example involving a mixture of qubits and qudits, we demonstrate 
how to implement a quantum walk on the 3 rd generation 3-Cayley tree (shown in 
Figure [19]) with the central node marked, by using its corresponding quantum 
circuit shown in Figure l20l 

The G n operator is defined as (G n )ij = \ — $ij V 1 <!,;'< n. Here, the G3 
operator acts on only 3 of the 4 subnode states, so it does not mix with the state 
1 1 1 ) . The ^ + and gates are generalized increment and decrement gates re- 
spectively. For a Z?-leveled qudit, they are defined as b-by-b matrices given as 
(M+)ij = S /(modb)+1 and = 5 lJ(modb)+1 respectively. A natural ex- 

tension to multiple qudits is given in Figure ED In general, the Rr and Rl oper- 
ators, shown in Figure [22l correspond to a clockwise and anticlockwise rotation 
of qudits. However, in the context of Figure l20l Rr and Rl are both single SWAP 
gates. 

Given the length of the code needed to simulate the quantum circuit for a 
quantum walk along the 3-Cayley tree, we refer the reader to the Mathematica 
notebook CUGates.nb. The results of the quantum walk across 50 steps (starting 
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Figure 19: 3* 



generation 3-Cayley tree. 



in an equal superposition of vertex states, which is then subdivided according to 
the subnode states of the vertex) is shown in Figure [23l where the centre marked 
node is distinguished by its much larger probability peak. 

6. Conclusions 

The Mathematica notebook presented in this paper utilizes an irreducible form 
of matrix decomposition of a general controlled quantum gate with multiple con- 
ditionals and is highly efficient in simulating complex quantum circuits. It pro- 
vides a powerful tool to assist researchers analyze the performance of proposed 



described in [|14il . which was addressed and acknowledged in II 1511 . Another im- 
portant application in which large and complex circuits need to be efficiently sim- 
ulated is in the area of quantum error correction, in which generalized control uni- 
tary gates are used with both qubits and qudits lfl6[[l7ll. This package will prove 
to be immensely helpful in the design of codification circuits in this area. Imple- 
mentation in Mathematica allows the code to be used in a cohesive and interactive 
environment which is nevertheless computationally powerful. The interactive na- 
ture of this environment also makes this notebook suitable for teaching, where 



quantum circuits. It has helped to identify several errors in the 
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Figure 20: Quantum circuit implementing a quantum walk along a 3 rd generation 3-Cayley tree, 
with the central node marked. Any vertex is uniquely defined by a combination of the level, tree 
number, and node states. 



quantum algorithms and quantum gate operations can be studied in detail. 
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Figure 21: 31 + and gates on n qudits, with a quantum circuit profile of £ = {£1, £2, Cn}- 
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Figure 22: and gates on n qudits, with a quantum circuit profile of £ = {£1, £2, • C«}- 



22 



'robability 




Figure 23: Probability distribution along the 3 generation 3-Cayley tree against the number of 
walking steps. 
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Appendices 



Appendix A. CU gate decomposition proof 



For any arbitary state, Po («|0) +b\l)) H>a|0) and Pi (a\0)+b\\)) ^b\l), i.e. 
the Po an d Pi operators projects arbitrary states onto the |0) and |1) computa- 
tional basis state respectively. Consider the quantum circuit in Figure [AT241 where 
\yri) =ai|0)+fci|l) and | y 2 ) = a 2 \0) + b 2 \l). 



IVi> ^ p o 



\Y2) 



U 



h 



Figure A. 24: Application of Po to the CU gate. 
SincePodv/i)) a\ |0), then 

CU (P (| yi» ® Wi)) H- fli|0) ® |y 2 } = Po (|Vi» (|V^» 

i.e. the £/ gate is not applied to the second qubit because the control qubit is in 
the state a\ |0) after the application of the Po operator, and thus the action of the 
CU gate is the identity operator. Hence, we can simplify the circuit, as shown in 
Figure lA24\ 

Similarly, if the Pi operator is applied as in Figure [A2~5l then Pi ( | x^i ) ) 1— > b\ | 1 ) 
and thus 

Ct/(^(|vri))®|v^))^*i|l)®t/(|^))=^(|Vi))®^(IVi)), 

because the control qubit is in the state b\\l) after the application of the Pi opera- 
tor, so the action of the CU gate is the U 2 operator. The equivalent circuit is also 
shown in Figure |AT25l 
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Figure A. 25: Application of Pi to the CU gate. 



26 



Note that M\ and M 2 , as defined in Figures IA.24I and IA.25I respectively, are 
non-unitary. However the sum M\ + M 2 = Pq <8> h + Pi ® U is unitary and also 

M l +M 2 = CU(P ®I 2 )+CU(P l ®I 2 ) (A.l) 
= CU{(P + P 1 )®I 2 ) (A.2) 
= CU{I 2 ®h) (A.3) 

= cc/ 

Consequently, CU = P <g)I 2 + Pi®)U. 

Appendix B. C 13 L^ 2 gate decomposition proof 

The decomposition can be derived by considering each of the possible permu- 
tations, which are defined as follows: 

Mj = C 1 > 3 U 2 (P ®I 2 ®P ) = P ®h®P (B.l) 

M 2 = C i - ? U 2 {Po®h®P l )=P Q ®I 2 ®Pi (B.2) 

M 3 =C L3 U 2 (Pi ®/ 2 ® P ) =Pi®h®Po (B.3) 
M 4 = C 1 ' 3 U 2 (P l ®I 2 ®Pi)=Pi®U®P i 

As before, we consider the permutation sum : 

M l +M 2 +M 3 +M 4 = C l > 3 U 2 (Po®I 2 ®P )+C l > 3 U 2 (Po®I 2 ®P l ) + (BA) 

C 1 ' 3 ^ 2 (Pi ®/ 2 ® P ) +C 1 ' 3 t/ 2 {Pi®I 2 ®Pi) (B.5) 



= C 1 > 3 U 2 (P ®I 2 ®(P +P l ) + (B.6) 

Pi®/ 2 ®(P + Pi)) (B.7) 

= C 1 ' 3 [/ 2 (( J P + / 5 i)®/2®/2) (B.8) 

= C l - ? U 2 (I 2 ®I 2 ®I 2 ) (B.9) 
= C 1 ' 3 f/ 2 



Consequently, C l ^ 3 U 2 = Pq®I 2 ®Pq + Pq®I 2 ®Pi+Pi®I 2 ®Pq + Pi®U ®Pi. 

Appendix C. Arbitrary CUG decomposition 

For an arbitrary CUG across qubits with k conditionals, we have 2 k possible 
permutations when placing a Pq or Pi projection operator in front of each condi- 
tional. Each permutation then has a column that is described by the tensor prod- 
uct of the projection operators with identity matrices in the appropriate positions 
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placed in front of the CUG. Proving that the sum of these permutations is equal to 
the gate itself is fairly trivial; it simply involves factoring together permutations 
that differ by a single conditional, using the identity Pq + P\ = h, and then doing 
so repeatedly until we end up with the original CUG. The simplification comes 
by considering the action of the projection operators on the state going into the 
CUG, and since the CUG implements the action iff the input state is in the ba- 
sis state corresponding to the conditionals, we can easily work out which of the 
permutations has the action of the CUG implemented, while the rest do not. 

Similarly, for an arbitrary CUG across qudits, we have a number of permuta- 
tions corresponding to the qudit levels on which the conditionals are placed, and 
b 

by using the identity ^ P a ^ = 4, we can readily prove that the sum of all permu- 
te 1 

tations corresponds to the CUG itself, and can thus simplify the permutations as 
before. In both cases, we can simplify the decomposition considerably by using 
the identity matrix to represent the sum of all permutations with no action ap- 
plied, then adding on the appropriate permutation with the action of the CUG and 
subtracting the same permutation without the action. 
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